AITopics | anchor box

MetaAnchor: Learning to Detect Objects with Customized Anchors

Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun

Neural Information Processing Systems, Feb-13-2026, 02:16:24 GMT

Neural Information Processing Systems http://nips.cc/

anchor box, anchor function, metaanchor, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > China (0.04)

Genre: Research Report (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

MetaAnchor: Learning to Detect Objects with Customized Anchors

Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun

Neural Information Processing Systems, Nov-20-2025, 17:08:15 GMT

There are a few recent studies on the topic, such as [33, 37].

artificial intelligence, machine learning, metaanchor, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > China (0.04)

Genre: Research Report (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

A Self Validation Network for Object-Level Human Attention Estimation

Zehua Zhang, Chen Yu, David Crandall

Neural Information Processing Systems, Oct-2-2025, 14:42:05 GMT

Due to the foveated nature of the human vision system, people can focus their visual attention on only a small region of their visual field at a time, which usually contains a single object.

computer vision, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Indiana (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)
(2 more...)

A Data-Driven RetinaNet Model for Small Object Detection in Aerial Images

Tang, Zhicheng, Tang, Jinwen, Shang, Yi

arXiv.org Artificial Intelligence, Sep-4-2025

In the realm of aerial imaging, the ability to detect small objects is pivotal for a myriad of applications, encompassing environmental surveillance, urban design, and crisis management. Leveraging RetinaNet, this work unveils DDR-Net: a data-driven, deep-learning model devised to enhance the detection of diminutive objects. DDR-Net introduces novel, data-driven techniques to autonomously ascertain optimal feature maps and anchor estimations, cultivating a tailored and proficient training process while maintaining precision. Additionally, this paper presents an innovative sampling technique to bolster model efficacy under limited data training constraints. The model's enhanced detection capabilities support critical applications including wildlife and habitat monitoring, traffic flow optimization, and public safety improvements through accurate identification of small objects like vehicles and pedestrians. DDR-Net significantly reduces the cost and time required for data collection and training, offering efficient performance even with limited data. Empirical assessments over assorted aerial avian imagery datasets demonstrate that DDR-Net markedly surpasses RetinaNet and alternative contemporary models. These innovations advance current aerial image analysis technologies and promise wide-ranging impacts across multiple sectors including agriculture, security, and archaeology.
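The abstract does not say how DDR-Net derives its anchor estimates. As a hedged illustration only, the sketch below uses a widely known data-driven approach, k-means clustering of ground-truth box sizes with IoU as the similarity measure. The function names `wh_iou` and `estimate_anchors`, the seeding scheme, and the sample sizes are assumptions made for this example and are not taken from the paper.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes given as (width, height), both placed at the origin."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def estimate_anchors(sizes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchor sizes with IoU-based k-means.

    Hypothetical helper: a generic technique, not DDR-Net's actual procedure.
    """
    rng = random.Random(seed)
    anchors = rng.sample(sizes, k)  # start from k of the observed box sizes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for size in sizes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: wh_iou(size, anchors[i]))
            clusters[best].append(size)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(anchors[i])  # keep an anchor that lost all members
        if updated == anchors:
            break  # assignments are stable
        anchors = updated
    # Sort by area so the smallest anchor comes first.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Toy data: three small boxes and three large, tall boxes.
sizes = [(10, 10), (11, 9), (9, 11), (50, 100), (52, 98), (48, 102)]
anchors = estimate_anchors(sizes, k=2)
```

On this toy data the two clusters are well separated, so the estimated anchors settle near the mean size of each group.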

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2509.02928

Country: North America > United States > Missouri (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.71)

Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection

Tao, Ye, Fu, Xinran, Pang, Honglin, Yang, Xi, Li, Chuntao

arXiv.org Artificial Intelligence, Aug-27-2025

Oracle Bone Inscriptions (OBIs) play a crucial role in understanding ancient Chinese civilization. The automated detection of OBIs from rubbing images represents a fundamental yet challenging task in digital archaeology, primarily due to various degradation factors including noise and cracks that limit the effectiveness of conventional detection networks. To address these challenges, we propose a novel clustering-based feature space representation learning method. Our approach uniquely leverages the Oracle Bones Character (OBC) font library dataset as prior knowledge to enhance feature extraction in the detection network through clustering-based representation learning. The method incorporates a specialized loss function derived from clustering results to optimize feature representation, which is then integrated into the total network loss. We validate the effectiveness of our method by conducting experiments on two OBI detection datasets using three mainstream detection frameworks: Faster R-CNN, DETR, and Sparse R-CNN. Through extensive experimentation, all frameworks demonstrate significant performance improvements.

artificial intelligence, knowledge, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.18641

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

AeroLite-MDNet: Lightweight Multi-task Deviation Detection Network for UAV Landing

Yang, Haiping, Liu, Huaxing, Wu, Wei, Chen, Zuohui, Wu, Ning

arXiv.org Artificial Intelligence, Jun-30-2025

Unmanned aerial vehicles (UAVs) are increasingly employed in diverse applications such as land surveying, material transport, and environmental monitoring. Following missions like data collection or inspection, UAVs must land safely at docking stations for storage or recharging, which is an essential requirement for ensuring operational continuity. However, accurate landing remains challenging due to factors like GPS signal interference. To address this issue, we propose a deviation warning system for UAV landings, powered by a novel vision-based model called AeroLite-MDNet. This model integrates a multiscale fusion module for robust cross-scale object detection and incorporates a segmentation branch for efficient orientation estimation. We introduce a new evaluation metric, Average Warning Delay (AWD), to quantify the system's sensitivity to landing deviations. Furthermore, we contribute a new dataset, UAVLand-Data, which captures real-world landing deviation scenarios to support training and evaluation. Experimental results show that our system achieves an AWD of 0.7 seconds with a deviation detection accuracy of 98.6%, demonstrating its effectiveness in enhancing UAV landing reliability.

Unmanned aerial vehicles (UAVs), also known as drones, have been widely used in fire detection, geological hazard monitoring, and dangerous behavior monitoring [1] for their agility, compactness, and cost-efficiency. To reduce the dependency of UAVs on human labor and skills, UAV nests are widely used to minimize manual operations, allowing the UAVs to perform autonomous monitoring. UAV nests also offer functionalities such as safe parking, charging, data transmission, routine maintenance, repairs, and communication relays [2].

artificial intelligence, machine learning, segmentation, (20 more...)

arXiv.org Artificial Intelligence

2506.21635

Country:

Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > China > Beijing > Beijing (0.04)
Europe > Poland (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

SO-DETR: Leveraging Dual-Domain Features and Knowledge Distillation for Small Object Detection

Zhang, Huaxiang, Zhang, Hao, Mei, Aoran, Gan, Zhongxue, Zhu, Guo-Niu

arXiv.org Artificial Intelligence, Apr-17-2025

Detection Transformer-based methods have achieved significant advancements in general object detection. However, challenges remain in effectively detecting small objects. One key difficulty is that existing encoders struggle to efficiently fuse low-level features. Additionally, the query selection strategies are not effectively tailored for small objects. To address these challenges, this paper proposes an efficient model, Small Object Detection Transformer (SO-DETR). The model comprises three key components: a dual-domain hybrid encoder, an enhanced query selection mechanism, and a knowledge distillation strategy. The dual-domain hybrid encoder integrates spatial and frequency domains to fuse multi-scale features effectively. This approach enhances the representation of high-resolution features while maintaining relatively low computational overhead. The enhanced query selection mechanism optimizes query initialization by dynamically selecting high-scoring anchor boxes using expanded IoU, thereby improving the allocation of query resources. Furthermore, by incorporating a lightweight backbone network and implementing a knowledge distillation strategy, we develop an efficient detector for small objects. Experimental results on the VisDrone-2019-DET and UAVVaste datasets demonstrate that SO-DETR outperforms existing methods with similar computational demands. The project page is available at https://github.com/ValiantDiligent/SO_DETR.
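The query selection step described in the abstract amounts to keeping the highest-scoring anchor boxes as decoder queries. The sketch below shows only that top-k step in plain Python. The abstract does not define how the expanded IoU score is computed, so the scores are treated as given inputs, and the name `select_queries` is a hypothetical helper rather than anything from the SO-DETR code base.

```python
def select_queries(anchors, scores, k):
    """Return the k anchors with the highest scores, best first.

    `scores` is assumed to be precomputed; how SO-DETR derives it from
    expanded IoU is not specified here.
    """
    if len(anchors) != len(scores):
        raise ValueError("anchors and scores must have the same length")
    # Indices sorted by descending score; ties keep their original order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [anchors[i] for i in order[:k]]


# Four candidate anchor boxes as (x1, y1, x2, y2) with made-up scores.
candidates = [(0, 0, 8, 8), (4, 4, 12, 12), (20, 20, 26, 30), (1, 1, 5, 5)]
scores = [0.1, 0.9, 0.5, 0.7]
queries = select_queries(candidates, scores, k=2)
```

With these made-up scores, the second and fourth candidates are kept, in that order.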

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.1147

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RepVGG-GELAN: Enhanced GELAN with VGG-STYLE ConvNets for Brain Tumour Detection

Balakrishnan, Thennarasi, Sengar, Sandeep Singh

arXiv.org Artificial Intelligence, May-6-2024

Object detection algorithms, particularly those based on YOLO, have demonstrated remarkable efficiency in balancing speed and accuracy. However, their application in brain tumour detection remains underexplored. This study proposes RepVGG-GELAN, a novel YOLO architecture enhanced with RepVGG, a reparameterized convolutional approach for object detection tasks, with a particular focus on brain tumour detection in medical images. RepVGG-GELAN leverages the RepVGG architecture to improve both speed and accuracy in detecting brain tumours. Integrating RepVGG into the YOLO framework aims to achieve a balance between computational efficiency and detection performance. This study includes a spatial pyramid pooling-based Generalized Efficient Layer Aggregation Network (GELAN) architecture, which further enhances the capability of RepVGG. Experimental evaluation conducted on a brain tumour dataset demonstrates the effectiveness of RepVGG-GELAN, which surpasses the existing RCS-YOLO in terms of precision and speed. Specifically, RepVGG-GELAN achieves an increased precision of 4.91% and an increased AP50 of 2.54% over the latest existing approach while operating at 240.7 GFLOPs. The proposed RepVGG-GELAN with GELAN architecture presents promising results, establishing itself as a state-of-the-art solution for accurate and efficient brain tumour detection in medical images.

architecture, detection, repvgg-gelan, (15 more...)

arXiv.org Artificial Intelligence

2405.03541

Country: Europe > United Kingdom (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)

Guarantee Regions for Local Explanations

Havasi, Marton, Parbhoo, Sonali, Doshi-Velez, Finale

arXiv.org Artificial Intelligence, Feb-20-2024

Interpretability methods that utilise local surrogate models (e.g. LIME) are very good at describing the behaviour of the predictive model at a point of interest, but they are not guaranteed to extrapolate to the local region surrounding the point. However, overfitting to the local curvature of the predictive model and malicious tampering can significantly limit extrapolation. We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct by explicitly describing those intervals along which the input features can be trusted. Our method produces an interpretable feature-aligned box where the prediction of the local surrogate model is guaranteed to match the predictive model. We demonstrate that our algorithm can be used to find explanations with larger guarantee regions that better cover the data manifold compared to existing baselines. We also show how our method can identify misleading local explanations with significantly poorer guarantee regions.

anchor box, dimension, explanation, (16 more...)

arXiv.org Artificial Intelligence

2402.12737

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Inner-IoU: More Effective Intersection over Union Loss with Auxiliary Bounding Box

Zhang, Hao, Xu, Cong, Zhang, Shuaijie

arXiv.org Artificial Intelligence, Nov-14-2023

With the rapid development of detectors, the Bounding Box Regression (BBR) loss function has been constantly updated and optimized. However, existing IoU-based BBR losses still focus on accelerating convergence by adding new loss terms, ignoring the limitations of the IoU loss term itself. Although IoU loss can in theory effectively describe the state of bounding box regression, in practical applications it cannot adjust itself to different detectors and detection tasks, and it does not generalize strongly. Based on the above, we first analyzed the BBR model and concluded that distinguishing different regression samples and using auxiliary bounding boxes of different scales to calculate losses can effectively accelerate the bounding box regression process. For high IoU samples, using smaller auxiliary bounding boxes to calculate losses can accelerate convergence, while larger auxiliary bounding boxes are suitable for low IoU samples. Then, we propose Inner-IoU loss, which calculates IoU loss through auxiliary bounding boxes. For different datasets and detectors, we introduce a scaling factor ratio to control the scale size of the auxiliary bounding boxes for calculating losses. Finally, we integrate Inner-IoU into existing IoU-based loss functions for simulation and comparative experiments. The experimental results demonstrate a further enhancement in detection performance with the method proposed in this paper, verifying the effectiveness and generalization ability of Inner-IoU loss. Code is available at https://github.com/malagoutou/Inner-IoU.
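The idea of computing IoU on scaled auxiliary boxes can be sketched in a few lines. The example below is a minimal illustration that assumes each auxiliary box shares the center of its original box and has its width and height multiplied by the scaling factor `ratio`. The helper names (`scale_box`, `iou`, `inner_iou_loss`) are hypothetical, and the linked repository remains the authoritative formulation.

```python
def scale_box(box, ratio):
    """Scale an (x1, y1, x2, y2) box about its center by `ratio`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * ratio / 2.0
    half_h = (y2 - y1) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou(a, b):
    """Plain intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def inner_iou_loss(pred, target, ratio=1.0):
    """1 - IoU computed on auxiliary boxes scaled by `ratio` (assumed form)."""
    return 1.0 - iou(scale_box(pred, ratio), scale_box(target, ratio))


pred, target = (0, 0, 2, 2), (1, 0, 3, 2)
loss_plain = inner_iou_loss(pred, target, ratio=1.0)  # same as ordinary IoU loss
loss_small = inner_iou_loss(pred, target, ratio=0.5)  # smaller auxiliary boxes
loss_large = inner_iou_loss(pred, target, ratio=2.0)  # larger auxiliary boxes
```

For this pair of boxes, the smaller auxiliary boxes no longer overlap and give the largest loss, while the larger auxiliary boxes overlap more and give the smallest loss. With `ratio=1.0` the function reduces to the ordinary IoU loss.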

iou sample, loss function, regression, (14 more...)

arXiv.org Artificial Intelligence

2311.02877

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (0.51)
Information Technology > Artificial Intelligence > Machine Learning (0.48)
